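Automatic pseudo-labeling is a powerful tool for tapping into large amounts of sequential unlabeled data. It is especially appealing in safety-critical autonomous driving applications, where performance requirements are extreme, datasets are large, and manual labeling is challenging. We propose to leverage the sequential nature of the captured data to improve pseudo-labeling in a teacher-student setup by training multiple teachers, each with access to different temporal information. This set of teachers, dubbed Concordance, provides higher-quality pseudo-labels for student training than standard methods. The outputs of the multiple teachers are combined via a novel pseudo-label confidence-guided criterion. Our experimental evaluation focuses on the 3D point cloud domain in urban driving scenarios. We show the performance of our method applied to multiple model architectures on the tasks of 3D semantic segmentation and 3D object detection on two benchmark datasets. Our method, which uses only 20% of the manual labels, outperforms some fully supervised methods. The performance boost is especially notable for classes that rarely appear in the training data, such as bicycles and pedestrians. The implementation of our method is publicly available at https://github.com/ctu-vras/t-concord3d.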
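The abstract does not spell out the confidence-guided criterion, so the following is only a minimal sketch of one simple way several teachers' outputs could be merged: per point, keep the label of the most confident teacher and drop the point if even that confidence is low. The function name, the `(label, confidence)` layout, and the `min_conf` threshold are all assumptions for illustration, not the authors' method.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: each teacher yields one (label, confidence) pair per point.
TeacherOutput = List[Tuple[int, float]]


def combine_pseudo_labels(
    teachers: List[TeacherOutput], min_conf: float = 0.5
) -> List[Optional[int]]:
    """Per point, keep the most confident teacher's label, or None if below min_conf."""
    if not teachers:
        return []
    n_points = len(teachers[0])
    if any(len(t) != n_points for t in teachers):
        raise ValueError("all teachers must label the same points")

    combined: List[Optional[int]] = []
    for i in range(n_points):
        # Pick the (label, confidence) pair with the highest confidence.
        label, conf = max((t[i] for t in teachers), key=lambda lc: lc[1])
        # Points where no teacher is confident stay unlabeled (assumed policy).
        combined.append(label if conf >= min_conf else None)
    return combined


# Two illustrative teachers labeling the same three points.
teacher_a = [(1, 0.9), (2, 0.4), (0, 0.3)]
teacher_b = [(1, 0.6), (3, 0.8), (0, 0.2)]
pseudo_labels = combine_pseudo_labels([teacher_a, teacher_b])
```

Leaving low-confidence points unlabeled is one common design choice in pseudo-labeling, since the student can then be trained only on points the teachers agree on or are sure about.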
translated by 谷歌翻译
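Object detection and semantic segmentation on 3D lidar point cloud data require expensive annotation. We propose a data augmentation method that makes use of already annotated data multiple times. Our augmentation framework reuses real data, automatically finds suitable placements in the scene to be augmented, and handles occlusions explicitly. Because real data is used, the scan points of newly inserted objects retain the physical characteristics of the lidar, such as intensity and raydrop. The pipeline proves competitive in training top-performing models for 3D object detection and semantic segmentation. The new augmentation provides a significant performance gain for rare and essential classes, notably a 6.65% average precision gain for the "Hard" pedestrian class in KITTI object detection and a 2.14 mean IoU gain over the state of the art in the SemanticKITTI segmentation challenge.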
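This paper presents a complete pipeline for learning continuous motion control policies for a mobile robot when only a non-differentiable physics simulator of robot-terrain interactions is available. The robot's multi-modal state estimation is also complex and difficult to simulate, so we simultaneously learn a generative model that refines the simulator outputs. We propose a coarse-to-fine learning paradigm in which coarse motion planning is paired with imitation learning and policy transfer to the real robot. The policy is optimized jointly with the generative model. We evaluate the method on a real-world platform in a batch of experiments.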
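This paper presents a novel technique that allows computationally fast and sufficiently plausible simulation of vehicles with non-deformable tracks. The method is based on an effect we call Contact Surface Motion. We compare it with several other methods for simulating tracked-vehicle dynamics, with the aim of evaluating methods that are available off the shelf, or with minimal effort, in general-purpose robotics simulators. The proposed method is implemented in the open-source physics simulator Gazebo using the Open Dynamics Engine.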
Independent Component Analysis (ICA) is an algorithm originally developed for finding separate sources in a mixed signal, such as a recording of multiple people in the same room speaking at the same time. It has also been used to find linguistic features in distributional representations. In this paper, we use ICA to analyze word embeddings. We find that ICA can be used to identify semantic features of words, and that these features can easily be combined to search for words that satisfy the combination. We show that only some of the independent components represent such features, but those that do are stable with regard to random initialization of the algorithm.
This paper presents a conversational AI platform called Flowstorm. Flowstorm is an open-source SaaS project suitable for creating, running, and analyzing conversational applications. Thanks to the fast and fully automated build process, the dialogues created within the platform can be executed in seconds. Furthermore, we propose a novel dialogue architecture that uses a combination of tree structures with generative models. The tree structures are also used for training NLU models suitable for specific dialogue scenarios. However, the generative models are globally used across applications and extend the functionality of the dialogue trees. Moreover, the platform functionality benefits from out-of-the-box components, such as the one responsible for extracting data from utterances or working with crawled data. Additionally, it can be extended using custom code directly in the platform. One of the essential features of the platform is the possibility to reuse the created assets across applications. There is a library of prepared assets to which each developer can contribute. All of the features are available through a user-friendly visual editor.
Within an operational framework, covers used by a steganographer are likely to come from different sensors and different processing pipelines than the ones used by researchers for training their steganalysis models. Thus, a performance gap is unavoidable when it comes to out-of-distribution covers, an extremely frequent scenario called Cover Source Mismatch (CSM). Here, we explore a grid of processing pipelines to study the origins of CSM, to better understand it, and to better tackle it. A greedy set-covering algorithm is used to select representative pipelines that minimize the maximum regret between each representative and the pipelines within its set. Our main contribution is a methodology for generating relevant bases able to tackle operational CSM. Experimental validation highlights that, for a given number of training samples, our set-covering selection is a better strategy than selecting random pipelines or using all the available pipelines. Our analysis also shows that parameters such as denoising, sharpening, and downsampling are very important to foster diversity. Finally, different benchmarks for classical and wild databases show the good generalization property of the extracted databases. Additional resources are available at github.com/RonyAbecidan/HolisticSteganalysisWithSetCovering.
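As a rough illustration of greedy set covering over pipelines, the sketch below assumes a precomputed matrix `regret[i][j]` (the regret of training on pipeline `i` and testing on pipeline `j`) and a fixed tolerance `max_regret`; both are hypothetical inputs. It is one plausible reading of the selection step and not necessarily the procedure used in the paper.

```python
from typing import Dict, List, Set


def greedy_set_cover(regret: List[List[float]], max_regret: float) -> List[int]:
    """Greedily pick representative pipelines so every pipeline is covered.

    Pipeline i "covers" pipeline j when regret[i][j] <= max_regret.
    """
    n = len(regret)
    covers: Dict[int, Set[int]] = {
        i: {j for j in range(n) if regret[i][j] <= max_regret} for i in range(n)
    }
    uncovered = set(range(n))
    chosen: List[int] = []
    while uncovered:
        # Take the pipeline covering the most still-uncovered pipelines;
        # ties are broken in favor of the lowest index.
        best = max(range(n), key=lambda i: (len(covers[i] & uncovered), -i))
        gained = covers[best] & uncovered
        if not gained:
            raise ValueError("some pipelines cannot be covered at this regret level")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Illustrative regret matrix for four pipelines (values are made up).
regret_matrix = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.8, 0.0, 0.2],
    [0.9, 0.9, 0.1, 0.0],
]
representatives = greedy_set_cover(regret_matrix, max_regret=0.25)
```

Lowering `max_regret` tightens the worst-case regret within each covered set at the cost of selecting more representatives.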
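We present AARGH, a task-oriented dialog system that combines retrieval and generative approaches in a single model, aiming to improve dialog management and the lexical diversity of outputs. The model features a new response selection method based on an action-aware training objective and a simplified single-encoder retrieval architecture, which together allow us to build an end-to-end retrieval-enhanced generation model in which retrieval and generation share most of the parameters. On the MultiWOZ dataset, we show that our approach produces more diverse outputs than state-of-the-art baselines while maintaining or improving state tracking and context-to-response generation performance.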
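Anomalies in time series provide insight into critical scenarios across a range of industries, from banking and aerospace to information technology, security, and medicine. However, identifying anomalies in time series data is particularly challenging because anomalies lack a precise definition, labels are frequently missing, and the temporal correlations present in such data are extremely complex. The LSTM Autoencoder is an encoder-decoder scheme for anomaly detection based on Long Short-Term Memory networks; it learns to reconstruct time-series behavior and then uses the reconstruction error to identify anomalies. We introduce the Denoising Architecture as a complement to this LSTM encoder-decoder model and investigate its effect on real-world as well as artificially generated datasets. We demonstrate that the proposed architecture increases both accuracy and training speed, making the LSTM Autoencoder more efficient for unsupervised anomaly detection tasks.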
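The scoring step described above (reconstruct, then threshold the reconstruction error) can be sketched independently of the network itself. In the sketch below, `mean_reconstruct` is a deliberately trivial stand-in for a trained LSTM autoencoder, and the mean-plus-`k`-standard-deviations threshold is an assumed, commonly used rule rather than the one from the paper.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence

Window = Sequence[float]


def mean_reconstruct(window: Window) -> List[float]:
    """Stand-in for a trained autoencoder: reconstructs a window as its mean."""
    m = mean(window)
    return [m] * len(window)


def reconstruction_errors(
    windows: Sequence[Window], reconstruct: Callable[[Window], Sequence[float]]
) -> List[float]:
    """Mean squared error between each window and its reconstruction."""
    return [
        mean((x - y) ** 2 for x, y in zip(w, reconstruct(w))) for w in windows
    ]


def flag_anomalies(
    errors: Sequence[float], train_errors: Sequence[float], k: float = 3.0
) -> List[bool]:
    """Flag windows whose error exceeds mean + k * std of the training errors."""
    threshold = mean(train_errors) + k * stdev(train_errors)
    return [e > threshold for e in errors]


# Illustrative data: normal training windows, then one normal and one spiky test window.
train_windows = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.2, 0.8, 1.0], [1.1, 0.9, 1.1, 0.9]]
test_windows = [[1.0, 1.1, 0.9, 1.0], [1.0, 1.0, 5.0, 1.0]]

train_err = reconstruction_errors(train_windows, mean_reconstruct)
flags = flag_anomalies(reconstruction_errors(test_windows, mean_reconstruct), train_err)
```

In a denoising setup, the model would be trained to reconstruct clean windows from corrupted inputs; the scoring step shown here would stay the same.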
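Robots working in unstructured environments must be able to sense and interpret their surroundings. One of the main obstacles for deep-learning-based models in robotics is the lack of domain-specific labeled data for different industrial applications. In this paper, we propose a sim2real transfer learning method for object detection based on domain randomization, with which labeled synthetic datasets of arbitrary size and object types can be generated automatically. Subsequently, a state-of-the-art convolutional neural network, YOLOv4, is trained to detect the different types of industrial objects. With the proposed domain randomization method, we narrow the reality gap, achieving mAP50 scores of 86.32% and 97.38% for zero-shot and one-shot transfer, respectively, on a dataset containing 190 real images. On a GeForce RTX 2080 Ti GPU, data generation takes less than 0.5 s per image and training lasts about 12 h, which makes the method convenient for industrial use. Our solution matches industrial needs, as it can reliably differentiate similar classes of objects using only one real image for training. To the best of our knowledge, this is the only work so far that satisfies these constraints.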